Using GVF for Clinical Annotation of Personal Genomes
Abstract
Accurately describing the contents of next-generation sequencing (NGS) results is vital to both research and clinical analysis of genomic data. Genomics and medicine use different, often incompatible terminologies and standards to describe sequence variants and their functional effects. This creates an information bottleneck that prevents efficient translation of genome-scale NGS information into the clinic. While the Variant Call Format (VCF) has met some of these challenges with regard to describing the results of variant calling pipelines, it lacks the structure needed for detailed annotation of the consequences of sequence alterations. To incorporate genomic results into electronic health records (EHR), the results must also be defined in ways that are compatible with existing medical informatics systems. The Genome Variation Format (GVF) is an extension of the existing genome annotation format GFF3, which uses ontologies to capture the semantic nature of the information on sequence features. GVF uses the Sequence Ontology (SO) to define the type of sequence alteration, the genomic features that are changed, and the effect of the change. We have extended and remodeled the Sequence Ontology to include and define more terms that describe the consequence of a variant upon genomic features, in support of the Ensembl variation databases. GVF represents genome annotations for clinical applications using existing EHR standards as defined by the international standards consortium Health Level 7. This means that GVF can describe the information that defines genetic tests, allowing seamless incorporation of genomic data into pre-existing EHR systems. Here we demonstrate the power of GVF to describe, to exchange, and to empower clinical interpretation of personal genome data through an extension of the GVF specification called GVFClin.
The Sequence Ontology Project maintains and updates the specification and provides the underlying structure that describes sequence features, sequence alterations and variant effects and their relationships to each other. The specification is available on the web at http://www.sequenceontology.org/resources/gvfclin.html.
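Because GVF extends GFF3, a feature line can be read as nine tab-separated columns whose last column holds semicolon-separated key=value attributes. The following is a minimal Python sketch of that idea, not the reference parser. The function name parse_gvf_line and the example record (sequence name, source, coordinates, and attribute values) are hypothetical and chosen only for illustration; consult the specification for the authoritative list of attributes and pragmas.

```python
from urllib.parse import unquote

# Names of the first eight GFF3 columns; the ninth column holds the attributes.
GFF3_COLUMNS = ("seqid", "source", "type", "start", "end", "score", "strand", "phase")


def parse_gvf_line(line):
    """Split one GFF3-style feature line into a dict (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")

    record = dict(zip(GFF3_COLUMNS, fields[:8]))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])

    # Attributes are key=value pairs separated by ';'; values may be
    # comma-separated lists and may contain URL-style percent escapes.
    attributes = {}
    for pair in fields[8].split(";"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        attributes[key] = [unquote(v) for v in value.split(",")]
    record["attributes"] = attributes
    return record


# Made-up record for illustration only; not taken from any real data set.
example = "\t".join([
    "chr1", "example_caller", "SNV", "12345", "12345", ".", "+", ".",
    "ID=var_1;Reference_seq=A;Variant_seq=G",
])

record = parse_gvf_line(example)
print(record["type"], record["attributes"]["Variant_seq"])
```

The type column carries the Sequence Ontology term for the kind of alteration (here SNV), which is where the ontology-based typing described above shows up in the file itself.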
Related articles
The personal genome browser: visualizing functions of genetic variants
Advances in high-throughput sequencing technologies have brought us into the individual genome era. Projects such as the 1000 Genomes Project have made individual genome sequencing increasingly popular. How to visualize, analyse and annotate individual genomes with knowledge bases to support genome studies and personalized healthcare is still a big challenge. The Personal Genome B...
gSearch: a fast and flexible general search tool for whole-genome sequencing
BACKGROUND: Various processes such as annotation and filtering of variants or comparison of variants in different genomes are required in whole-genome or exome analysis pipelines. However, processing different databases and searching among millions of genomic loci is not trivial. RESULTS: gSearch compares sequence variants in the Genome Variation Format (GVF) or Variant Call Format (VCF) with a...
Improving the Sequence Ontology terminology for genomic variant annotation
BACKGROUND: The Genome Variant Format (GVF) uses the Sequence Ontology (SO) to enable detailed annotation of sequence variation. The annotation includes SO terms for the type of sequence alteration, the genomic features that are changed and the effect of the alteration. The SO maintains and updates the specification and provides the underlying ontological structure. METHODS: A requirements ana...
Systematic analysis and functional annotation of variations in the genome of an Indian individual.
Whole genome sequencing of personal genomes has revealed a large repertoire of genomic variations and has provided a rich template for identification of common and rare variants in genomes in addition to understanding the genetic basis of diseases. The widespread application of personal genome sequencing in clinical settings for predictive and preventive medicine has been limited due to the lac...
Verdant: automated annotation, alignment and phylogenetic analysis of whole chloroplast genomes
MOTIVATION: Chloroplast genomes are now produced in the hundreds for angiosperm phylogenetics projects, but current methods for annotation, alignment and tree estimation still require some manual intervention, reducing throughput and increasing analysis time for large chloroplast systematics projects. RESULTS: Verdant is a web-based software suite and database built to take advantage of a novel ann...